1. INTRODUCTION

This manual describes the C language on the DEC PDP-11, the DEC VAX-11/780, the Honeywell 
6000, the IBM System/370, and the Interdata 8/32. Where differences exist, it concentrates on the PDP- 
11, but tries to point out implementation-dependent details. With few exceptions, such dependencies fol- 
low directly from the properties of the hardware; the various compilers are generally quite compatible. 


2. LEXICAL CONVENTIONS 

There are six classes of tokens: identifiers, keywords, constants, strings, operators, and other separators.
Blanks, tabs, new-lines, and comments (collectively, ‘‘white space’’) as described below are ignored
except as they serve to separate tokens. Some white space is required to separate otherwise adjacent
identifiers, keywords, and constants.

If the input stream has been parsed into tokens up to a given character, the next token is taken to 
include the longest string of characters which could possibly constitute a token. 
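
As a small illustration of the longest-token rule, the characters a+++b are split into a, ++, + and b, so the expression means (a++) + b. The sketch below assumes a present-day compiler; the function name munch_demo is invented for this example.

```c
/* Hypothetical helper illustrating the longest-token rule. */
int munch_demo(void)
{
    int a = 1, b = 2;
    int sum = a+++b;   /* tokens: a ++ + b, i.e. (a++) + b */

    /* sum is 1 + 2; a was incremented afterwards, b is untouched. */
    return sum * 100 + a * 10 + b;
}
```

The packed result is 322: a sum of 3, with a now 2 and b still 2.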


2.1 Comments 
The characters /* introduce a comment, which terminates with the characters */. Comments do
not nest. 


2.2 Identifiers (Names) 

An identifier is a sequence of letters and digits; the first character must be a letter; the underscore _ 
counts as a letter. Upper- and lower-case letters are different. No more than the first eight characters are 
significant, although more may be used. External identifiers, which are used by various assemblers and 
loaders, are more restricted: 


DEC PDP-11 7 characters, 2 cases 
DEC VAX-11 7 characters, 2 cases 
Honeywell 6000 6 characters, 1 case 
IBM 360/370 7 characters, 1 case 
Interdata 8/32 8 characters, 2 cases 


2.3 Keywords 
The following identifiers are reserved for use as keywords, and may not be used otherwise: 


auto        do        float    register    switch
break       double    for      return      typedef
case        else      goto     short       union
char        entry     if       sizeof      unsigned
continue    enum      int      static      void
default     extern    long     struct      while


The entry keyword is not currently implemented by any compiler but is reserved for future use. Some 
implementations also reserve the words fortran and asm. 


* This manual is reprinted, with minor changes, from The C Programming Language by Brian W. Kernighan 
and Dennis M. Ritchie, Prentice Hall, Inc., 1978. It specifies the language definition as of September, 1980. 
2.4 Constants
There are several kinds of constants, as listed below. Hardware characteristics that affect sizes are 
summarized in §2.6. 


2.4.1 Integer constants 

An integer constant consisting of a sequence of digits is taken to be octal if it begins with 0 (digit 
zero), decimal otherwise. A sequence of digits preceded by 0x or 0X (digit zero) is taken to be a hexadecimal
integer. The hexadecimal digits include a or A through f or F with values 10 through 15. A
decimal constant whose value exceeds the largest signed machine integer is taken to be long; an octal or 
hex constant which exceeds the largest unsigned machine integer is likewise taken to be long. 
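
The three radixes can be compared directly. This is a minimal sketch; radix_demo is a name invented for the example.

```c
/* Hypothetical helper: the same value written in three radixes. */
int radix_demo(void)
{
    int dec = 31;     /* no prefix: decimal */
    int oct = 037;    /* leading 0: octal, 3*8 + 7 */
    int hex = 0x1F;   /* leading 0x: hexadecimal, 1*16 + 15 */

    return dec == oct && oct == hex;
}
```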


2.4.2 Explicit long constants 

A decimal, octal, or hexadecimal integer constant immediately followed by l (letter ell) or L is a
long constant. As discussed below, on some machines integer and long values may be considered identi- 
cal. 


2.4.3 Character constants 

A character constant is a character enclosed in single quotes, as in 'x'. The value of a character
constant is the numerical value of the character in the machine’s character set. 

Certain non-graphic characters, the single quote ' and the backslash \, may be represented according
to the following table of escape sequences: 


new-line           NL (LF)   \n
horizontal tab     HT        \t
vertical tab       VT        \v
backspace          BS        \b
carriage return    CR        \r
form feed          FF        \f
backslash          \         \\
single quote       '         \'
bit pattern        ddd       \ddd


The escape \ddd consists of the backslash followed by 1, 2, or 3 octal digits which are taken to specify the 
value of the desired character. A special case of this construction is \0 (not followed by a digit), which 
indicates the character NUL. If the character following a backslash is not one of those specified, the 
backslash is ignored. 
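
A few of these escapes can be checked against their numeric codes. The sketch below assumes an ASCII machine (on an EBCDIC machine the codes differ); escape_demo is a name invented for the example.

```c
/* Hypothetical helper; the numeric codes assume an ASCII machine. */
int escape_demo(void)
{
    int ok = 1;

    ok = ok && '\n' == 10;      /* new-line (LF) in ASCII */
    ok = ok && '\t' == 9;       /* horizontal tab in ASCII */
    ok = ok && '\101' == 'A';   /* octal 101 is the ASCII code for A */
    ok = ok && '\0' == 0;       /* NUL, whatever the character set */
    ok = ok && '\\' == 0134;    /* backslash, octal 134 in ASCII */
    return ok;
}
```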


2.4.4 Floating constants 

A floating constant consists of an integer part, a decimal point, a fraction part, an e or E, and an 
optionally signed integer exponent. The integer and fraction parts both consist of a sequence of digits. 
Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e 
and the exponent (not both) may be missing. Every floating constant is taken to be double-precision. 


2.4.5 Enumeration constants 
Names declared as enumerators (see §8.5) are constants of the corresponding enumeration type. They 
behave like int constants. 


2.5 Strings 

A string is a sequence of characters surrounded by double quotes, as in "...". A string has type
‘‘array of characters’’ and storage class static (see §4 below) and is initialized with the given characters.
All strings, even when written identically, are distinct. The compiler places a null byte \0 at the
end of each string so that programs which scan the string can find its end. In a string, the double quote
character " must be preceded by a \; in addition, the same escapes as described for character constants
may be used. Finally, a \ and the immediately following new-line are ignored.
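
The terminating null byte and the two string-specific escapes can be seen in a short sketch. The name string_demo is invented; strlen and strcmp come from the standard library of a present-day compiler.

```c
#include <string.h>

/* Hypothetical helper: storage occupied by a string versus its length. */
int string_demo(void)
{
    static char s[] = "ab\"c";   /* a, b, ", c and the terminating \0 */
    static char t[] = "one \
line";                           /* the \ and the new-line vanish */

    return sizeof(s) == 5 && strlen(s) == 4 && s[4] == '\0'
        && strcmp(t, "one line") == 0;
}
```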

2.6 Hardware characteristics
The following table summarizes certain hardware properties that vary from machine to machine. 


          DEC PDP-11   DEC VAX-11   Honeywell 6000   IBM 370    Interdata 8/32
          ASCII        ASCII        ASCII            EBCDIC     ASCII
char      8 bits       8 bits       9 bits           8 bits     8 bits
int       16           32           36               32         32
short     16           16           36               16         16
long      32           32           36               32         32
float     32           32           36               32         32
double    64           64           72               64         64
range     ±10^±38      ±10^±38      ±10^±38          ±10^±76    ±10^±76


3. SYNTAX NOTATION 

In the syntax notation used in this manual, syntactic categories are indicated by italic type, and literal 
words and characters in constant-width type. Alternative categories are listed on separate lines. An 
optional terminal or non-terminal symbol is indicated by the subscript ‘‘opt,’’ so that 


{ expression_opt }


indicates an optional expression enclosed in braces. The syntax is summarized in §18. 


4. WHAT’S IN A NAME? 

C bases the interpretation of an identifier upon two attributes of the identifier: its storage class and its 
type. The storage class determines the location and lifetime of the storage associated with an identifier; 
the type determines the meaning of the values found in the identifier’s storage. 

There are four declarable storage classes: automatic, static, external, and register. Automatic vari- 
ables are local to each invocation of a block (§9.2), and are discarded upon exit from the block; static 
variables are local to a block, but retain their values upon reentry to a block even after control has left 
the block; external variables exist and retain their values throughout the execution of the entire program, 
and may be used for communication between functions, even separately compiled functions. Register 
variables are (if possible) stored in the fast registers of the machine; like automatic variables they are 
local to each block and disappear on exit from the block. 
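
The four storage classes can be placed side by side. This is only a sketch; the names calls_total and count_calls are invented for the example.

```c
int calls_total;                /* external: exists for the whole program */

/* Hypothetical helper showing automatic, static and register locals. */
int count_calls(void)
{
    int fresh = 0;              /* automatic: re-created on every call */
    static int kept = 0;        /* static: value survives between calls */
    register int step = 1;      /* register: a hint, otherwise like automatic */

    fresh += step;              /* always 1, since fresh starts over each time */
    kept += fresh;              /* accumulates across calls */
    calls_total = kept;
    return kept;
}
```

Successive calls return 1, 2, 3, and so on, because only kept remembers its previous value.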

C supports several fundamental types of objects: 

Objects declared as characters (char) are large enough to store any member of the implementation’s 
character set, and if a genuine character from that character set is stored in a character variable, its value 
is equivalent to the integer code for that character. Other quantities may be stored into character vari- 
ables, but the implementation is machine-dependent. 

Up to three sizes of integer, declared short int, int, and long int, are available. Longer 
integers provide no less storage than shorter ones, but the implementation may make either short
integers, or long integers, or both, equivalent to plain integers. ‘‘Plain’’ integers have the natural size 
suggested by the host machine architecture; the other sizes are provided to meet special needs. 

Each enumeration (§8.5) is conceptually a separate type with its own set of named constants. The 
properties of an enum type are identical to those of int type. 

Unsigned integers, declared unsigned, obey the laws of arithmetic modulo 2^n where n is the
number of bits in the representation. (On the PDP-11, unsigned long quantities are not supported.) 

Single-precision floating point (float) and double-precision floating point (double) may be 
synonymous in some implementations. 

Because objects of the foregoing types can usefully be interpreted as numbers, they will be referred 
to as arithmetic types. Types char, int of all sizes, and enum will collectively be called integral types. 
float and double will collectively be called floating types. 

The void type specifies an empty set of values. It is used as the type returned by functions that 
generate no value. 

Besides the fundamental arithmetic types there is a conceptually infinite class of derived types
constructed from the fundamental types in the following ways:

arrays of objects of most types;

functions which return objects of a given type; 

pointers to objects of a given type; 

structures containing a sequence of objects of various types; 

unions capable of containing any one of several objects of various types. 
In general these methods of constructing objects can be applied recursively. 


5. OBJECTS AND LVALUES 

An object is a manipulatable region of storage; an lvalue is an expression referring to an object. An
obvious example of an lvalue expression is an identifier. There are operators which yield lvalues: for
example, if E is an expression of pointer type, then *E is an lvalue expression referring to the object to
which E points. The name ‘‘lvalue’’ comes from the assignment expression E1 = E2 in which the left
operand E1 must be an lvalue expression. The discussion of each operator below indicates whether it
expects lvalue operands and whether it yields an lvalue.


6. CONVERSIONS 

A number of operators may, depending on their operands, cause conversion of the value of an 
operand from one type to another. This section explains the result to be expected from such conver- 
sions. §6.6 summarizes the conversions demanded by most ordinary operators; it will be supplemented as 
required by the discussion of each operator. 


6.1 Characters and integers 

A character or a short integer may be used wherever an integer may be used. In all cases the value 
is converted to an integer. Conversion of a shorter integer to a longer always involves sign extension; 
integers are signed quantities. Whether or not sign-extension occurs for characters is machine dependent, 
but it is guaranteed that a member of the standard character set is non-negative. Of the machines treated 
here, only the PDP-11 and VAX-11 sign-extend. On these machines, char variables range in value from 
-128 to 127. The more explicit type unsigned char forces the values to range from 0 to 255.

On machines that treat characters as signed, the characters of the ASCII set are all positive. How- 
ever, a character constant specified with an octal escape suffers sign extension and may appear negative; 
for example, '\377' has the value -1.

When a longer integer is converted to a shorter or to a char, it is truncated on the left; excess bits 
are simply discarded. 
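
Truncation on the left can be observed with an unsigned destination. The sketch assumes 8-bit characters, as on most of the machines listed in §2.6; truncate_demo is a name invented for the example.

```c
/* Hypothetical helper; assumes 8-bit chars. */
int truncate_demo(void)
{
    unsigned int wide = 0x1234;
    unsigned char narrow = wide;   /* high-order bits are discarded */

    return narrow == 0x34;
}
```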


6.2 Float and double 

All floating arithmetic in C is carried out in double-precision; whenever a float appears in an 
expression it is lengthened to double by zero-padding its fraction. When a double must be converted 
to float, for example by an assignment, the double is rounded before truncation to float length.


6.3 Floating and integral 

Conversions of floating values to integral type tend to be rather machine-dependent; in particular the
direction of truncation of negative numbers varies from machine to machine. The result is undefined if 
the value will not fit in the space provided. 

Conversions of integral values to floating type are well behaved. Some loss of precision occurs if the 
destination lacks sufficient bits. 


6.4 Pointers and integers 

An expression of integral type may be added to or subtracted from a pointer; in such a case the first 
is converted as specified in the discussion of the addition operator. 

Two pointers to objects of the same type may be subtracted; in this case the result is converted to an 
integer as specified in the discussion of the subtraction operator. 


6.5 Unsigned 
Whenever an unsigned integer and a plain integer are combined, the plain integer is converted to
unsigned and the result is unsigned. The value is the least unsigned integer congruent to the signed
integer (modulo 2^wordsize). In a 2's complement representation, this conversion is conceptual and there is
no actual change in the bit pattern.

When an unsigned integer is converted to long, the value of the result is the same numerically as
that of the unsigned integer. Thus the conversion amounts to padding with zeros on the left.
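
A frequent surprise follows from the first rule: a negative plain integer compared with an unsigned one is first made unsigned. The sketch below uses UINT_MAX from <limits.h> of a present-day compiler; unsigned_demo is a name invented for the example.

```c
#include <limits.h>

/* Hypothetical helper: a plain int meets an unsigned operand. */
int unsigned_demo(void)
{
    int i = -1;
    unsigned int u = 1;
    long l = u;                 /* same numeric value after widening */

    /* i is converted to unsigned and becomes the largest unsigned value,
       so the comparison i < u is false. */
    return (i < u) == 0 && (unsigned int)i == UINT_MAX && l == 1;
}
```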


6.6 Arithmetic conversions
A great many operators cause conversions and yield result types in a similar way. This pattern will 
be called the ‘‘usual arithmetic conversions.’’


First, any operands of type char or short are converted to int, and any of type float are con- 
verted to double. 

Then, if either operand is double, the other is converted to double and that is the type of the 
result. 

Otherwise, if either operand is long, the other is converted to long and that is the type of the 
result. 

Otherwise, if either operand is unsigned, the other is converted to unsigned and that is the
type of the result. 

Otherwise, both operands must be int, and that is the type of the result. 
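
The integral and double steps of this pattern can be observed through sizeof and through division. The sketch leaves float out, because present-day compilers need not widen float operands the way this manual describes; conversion_demo is a name invented for the example.

```c
/* Hypothetical helper illustrating the usual arithmetic conversions. */
int conversion_demo(void)
{
    char c = 'a';
    short s = 1;

    return sizeof(c + s) == sizeof(int)     /* char and short become int */
        && 1 / 2 == 0                       /* both int: integer division */
        && 1 / 2.0 == 0.5                   /* one double: the other is converted */
        && sizeof(1 + 1L) == sizeof(long);  /* one long: the result is long */
}
```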


6.7 Void 
The (nonexistent) value of a void object may not be used in any way, and neither explicit nor implicit 
conversion may be applied. Because a void expression denotes a nonexistent value, such an expression 
may be used only as an expression statement (§9.1) or as the left operand of a comma expression 
(§7.15). 

An expression may be converted to type void by use of a cast. For example, this makes explicit the 
discarding of the value of a function call used as an expression statement. 


7. EXPRESSIONS 

The precedence of expression operators is the same as the order of the major subsections of this sec- 
tion, highest precedence first. Thus, for example, the expressions referred to as the operands of + (§7.4) 
are those expressions defined in §§7.1-7.3. Within each subsection, the operators have the same precedence.
Left- or right-associativity is specified in each subsection for the operators discussed therein.
The precedence and associativity of all the expression operators is summarized in the grammar of §18. 

Otherwise the order of evaluation of expressions is undefined. In particular the compiler considers 
itself free to compute subexpressions in the order it believes most efficient, even if the subexpressions 
involve side effects. The order in which side effects take place is unspecified. Expressions involving a 
commutative and associative operator (*, +, &, |, ^) may be rearranged arbitrarily, even in the presence
of parentheses; to force a particular order of evaluation an explicit temporary must be used. 
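
The explicit temporary mentioned above might look as follows. This is a sketch; first, second, trace and ordered_sum are names invented for the example, and trace merely records the order in which the two functions run.

```c
int trace[2], ntrace;           /* records the order of the calls */

static int first(void)  { trace[ntrace++] = 1; return 10; }
static int second(void) { trace[ntrace++] = 2; return 20; }

/* Hypothetical helper: call first() strictly before second(). */
int ordered_sum(void)
{
    int t;

    ntrace = 0;
    t = first();            /* the temporary pins down the order */
    return t + second();    /* first() + second() alone would not */
}
```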

The handling of overflow and divide check in expression evaluation is machine-dependent. Most 
existing implementations of C ignore integer overflows; treatment of division by 0, and all floating-point 
exceptions, varies between machines, and is usually adjustable by a library function. 


7.1 Primary expressions 
Primary expressions involving ., ->, subscripting, and function calls group left to right. 


primary-expression: 
identifier 
constant 
string 
( expression ) 
primary-expression [ expression ]
primary-expression ( expression-list_opt )
primary-expression . identifier
primary-expression -> identifier


expression-list: 
expression 
expression-list , expression 


An identifier is a primary expression, provided it has been suitably declared as discussed below. Its type
is specified by its declaration. If the type of the identifier is ‘‘array of ...’’, however, then the value of
the identifier-expression is a pointer to the first object in the array, and the type of the expression is
‘‘pointer to ...’’. Moreover, an array identifier is not an lvalue expression. Likewise, an identifier which
is declared ‘‘function returning ...’’, when used except in the function-name position of a call, is converted
to ‘‘pointer to function returning ...’’.

A constant is a primary expression. Its type may be int, long, or double depending on its form.
Character constants have type int; floating constants are double. 

A string is a primary expression. Its type is originally ‘‘array of char’’; but following the same rule 
given above for identifiers, this is modified to “‘pointer to char’’ and the result is a pointer to the first 
character in the string. (There is an exception in certain initializers; see §8.6.) 

A parenthesized expression is a primary expression whose type and value are identical to those of the 
unadorned expression. The presence of parentheses does not affect whether the expression is an lvalue. 

A primary expression followed by an expression in square brackets is a primary expression. The 
intuitive meaning is that of a subscript. Usually, the primary expression has type ‘‘pointer to ...’’, the 
subscript expression is int, and the type of the result is ‘‘...’’. The expression E1[E2] is identical (by
definition) to *((E1)+(E2)). All the clues needed to understand this notation are contained in this 
section together with the discussions in §§ 7.1, 7.2, and 7.4 on identifiers, *, and + respectively; §14.3 
below summarizes the implications. 
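
The identity between subscripting and pointer arithmetic can be checked directly. A minimal sketch; subscript_demo is a name invented for the example.

```c
/* Hypothetical helper: three spellings of the same element access. */
int subscript_demo(void)
{
    static int a[4] = { 10, 20, 30, 40 };
    int *p = a;             /* the array name yields a pointer to a[0] */

    return a[2] == *(a + 2) /* E1[E2] is *((E1)+(E2)) */
        && p[2] == a[2]     /* a pointer may be subscripted as well */
        && 2[a] == a[2];    /* addition commutes, so this is legal too */
}
```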

A function call is a primary expression followed by parentheses containing a possibly empty, 
comma-separated list of expressions which constitute the actual arguments to the function. The primary 
expression must be of type ‘‘function returning ...’’, and the result of the function call is of type ‘‘...’’.
As indicated below, a hitherto unseen identifier followed immediately by a left parenthesis is contextually
declared to represent a function returning an integer; thus in the most common case, integer-valued 
functions need not be declared. 

Any actual arguments of type float are converted to double before the call; any of type char or
short are converted to int; and as usual, array names are converted to pointers. No other conversions
are performed automatically; in particular, the compiler does not compare the types of actual arguments
with those of formal arguments. If conversion is needed, use a cast; see §§7.2, 8.7.

In preparing for the call to a function, a copy is made of each actual parameter; thus, all argument- 
passing in C is strictly by value. A function may change the values of its formal parameters, but these 
changes cannot affect the values of the actual parameters. On the other hand, it is possible to pass a 
pointer on the understanding that the function may change the value of the object to which the pointer 
points. An array name is a pointer expression. The order of evaluation of arguments is undefined by the 
language; take note that the various compilers differ. 
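
The difference between passing a value and passing a pointer can be sketched as follows; the function names are invented for the example.

```c
/* Hypothetical helpers: one receives a copy, the other a pointer. */
void bump_copy(int n)    { n = n + 1; }     /* changes only the local copy */
void bump_object(int *p) { *p = *p + 1; }   /* changes the caller's object */

int call_demo(void)
{
    int x = 5;

    bump_copy(x);       /* x is still 5 */
    bump_object(&x);    /* x is now 6 */
    return x;
}
```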

Recursive calls to any function are permitted. 

A primary expression followed by a dot followed by an identifier is an expression. The first expression
must be a structure or a union, and the identifier must name a member of the structure or union.
The value is the named member of the structure or union, and it is an lvalue if the first expression is an
lvalue.

A primary expression followed by an arrow (built from a - and a >) followed by an identifier is an 
expression. The first expression must be a pointer to a structure or a union and the identifier must name
a member of that structure or union. The result is an lvalue referring to the named member of the struc- 
ture or union to which the pointer expression points. Thus the expression E1->MOS is the same as 
(*E1).MOS. Structures and unions are discussed in §8.5. 
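
Both member operators appear in the sketch below; struct point and member_demo are names invented for the example.

```c
struct point { int x; int y; };   /* hypothetical structure */

int member_demo(void)
{
    struct point pt;
    struct point *pp = &pt;

    pt.x = 3;            /* pt is an lvalue, so pt.x is one too */
    pp->y = 4;           /* same as (*pp).y = 4 */
    return (*pp).x + pt.y;
}
```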


7.2 Unary operators 
Expressions with unary operators group right-to-left. 


unary-expression:
* expression
& lvalue
- expression
! expression
~ expression
++ lvalue
-- lvalue
lvalue ++
lvalue --
( type-name ) expression
sizeof expression
sizeof ( type-name )


The unary * operator means indirection: the expression must be a pointer, and the result is an lvalue
referring to the object to which the expression points. If the type of the expression is ‘‘pointer to ...’’,
the type of the result is ‘‘...’’.

The result of the unary & operator is a pointer to the object referred to by the lvalue. If the type of
the lvalue is ‘‘...’’, the type of the result is ‘‘pointer to ...’’.

The result of the unary - operator is the negative of its operand. The usual arithmetic conversions 
are performed. The negative of an unsigned quantity is computed by subtracting its value from 2^n, where
n is the number of bits in an int. There is no unary + operator. 

The result of the logical negation operator ! is 1 if the value of its operand is 0, 0 if the value of its 
operand is non-zero. The type of the result is int. It is applicable to any arithmetic type or to pointers. 

The ~ operator yields the one’s complement of its operand. The usual arithmetic conversions are 
performed. The type of the operand must be integral.

The object referred to by the lvalue operand of prefix ++ is incremented. The value is the new value 
of the operand, but is not an lvalue. The expression ++x is equivalent to x+=1. See the discussions of
addition (§7.4) and assignment operators (§7.14) for information on conversions. 

The lvalue operand of prefix -- is decremented analogously to the prefix ++ operator. 

When postfix ++ is applied to an lvalue the result is the value of the object referred to by the lvalue.
After the result is noted, the object is incremented in the same manner as for the prefix ++ operator. 
The type of the result is the same as the type of the lvalue expression. 

When postfix -- is applied to an lvalue the result is the value of the object referred to by the lvalue.
After the result is noted, the object is decremented in the same manner as for the prefix -- operator. The
type of the result is the same as the type of the lvalue expression. 

An expression preceded by the parenthesized name of a data type causes conversion of the value of 
the expression to the named type. This construction is called a cast. Type names are described in §8.7. 

The sizeof operator yields the size, in bytes, of its operand. (A byte is undefined by the language 
except in terms of the value of sizeof. However, in all existing implementations a byte is the space 
required to hold a char.) When applied to an array, the result is the total number of bytes in the array. 
The size is determined from the declarations of the objects in the expression. This expression is semanti- 
cally an unsigned constant* and may be used anywhere a constant is required. Its major use is in com- 
munication with routines like storage allocators and I/O systems. 

The sizeof operator may also be applied to a parenthesized type name. In that case it yields the 
size, in bytes, of an object of the indicated type. 

The construction sizeof (type) is taken to be a unit, so the expression sizeof (type)-2 is the
same as (sizeof (type))-2.
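
A common use of sizeof is to count the elements of an array. The sketch below also checks that sizeof (type) is treated as a unit; sizeof_demo is a name invented for the example.

```c
/* Hypothetical helper: counting array elements with sizeof. */
int sizeof_demo(void)
{
    static long table[5];
    int n = sizeof(table) / sizeof(table[0]);   /* bytes in array / bytes per element */

    return n == 5
        && sizeof(char) == 1
        /* Were this parsed as sizeof((long)-2), it would equal sizeof(long). */
        && sizeof(long) - 2 != sizeof(long);
}
```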


7.3 Multiplicative operators 
The multiplicative operators *, /, and % group left-to-right. The usual arithmetic conversions are
performed.


multiplicative-expression:
expression * expression 
expression / expression 
expression % expression 


The binary * operator indicates multiplication. The * operator is associative and expressions with 
several multiplications at the same level may be rearranged by the compiler. 

The binary / operator indicates division. When positive integers are divided truncation is toward 0, 
but the form of truncation is machine-dependent if either operand is negative. On all machines covered 
by this manual, the remainder has the same sign as the dividend. It is always true that 
(a/b)*b + a%b is equal to a (if b is not 0). 

The binary % operator yields the remainder from the division of the first expression by the second. 
The usual arithmetic conversions are performed. The operands must not be float. 
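
The relation between / and % can be tested without relying on the direction of truncation. A minimal sketch; divmod_holds is a name invented for the example, and b is assumed to be non-zero.

```c
/* Hypothetical helper: checks (a/b)*b + a%b == a for b != 0. */
int divmod_holds(int a, int b)
{
    int q = a / b;      /* quotient; rounding of negatives may vary */
    int r = a % b;      /* remainder, consistent with q */

    return q * b + r == a;
}
```

For positive operands the individual results are fixed (7/2 is 3 and 7%2 is 1); for negative operands only the identity itself is portable.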


7.4 Additive operators 
The additive operators + and - group left-to-right. The usual arithmetic conversions are performed. 
There are some additional type possibilities for each operator. 


* As of this writing, sizeof expressions are unsigned only for the PDP-11 compiler; other compilers treat
them as integers. 

additive-expression:
expression + expression 
expression - expression


The result of the + operator is the sum of the operands. A pointer to an object in an array and a value of 
any integral type may be added. The latter is in all cases converted to an address offset by multiplying it 
by the length of the object to which the pointer points. The result is a pointer of the same type as the 
original pointer, and which points to another object in the same array, appropriately offset from the origi- 
nal object. Thus if P is a pointer to an object in an array, the expression P+1 is a pointer to the next 
object in the array. 

No further type combinations are allowed for pointers. 

The + operator is associative and expressions with several additions at the same level may be rear- 
ranged by the compiler. 

The result of the - operator is the difference of the operands. The usual arithmetic conversions are 
performed. Additionally, a value of any integral type may be subtracted from a pointer, and then the 
same conversions as for addition apply. 

If two pointers to objects of the same type are subtracted, the result is converted (by division by the 
length of the object) to an int representing the number of objects separating the pointed-to objects. 
This conversion will in general give unexpected results unless the pointers point to objects in the same 
array, since pointers, even to objects of the same type, do not necessarily differ by a multiple of the 
object-length. 
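
Pointer addition and subtraction count objects rather than bytes, as this sketch suggests; pointer_demo is a name invented for the example.

```c
/* Hypothetical helper: pointer arithmetic counts objects, not bytes. */
int pointer_demo(void)
{
    static double v[6];
    double *p = &v[1];
    double *q = p + 3;          /* three objects further on: &v[4] */

    return q == &v[4]
        && q - p == 3           /* difference measured in objects */
        && (char *)q - (char *)p == 3 * (int)sizeof(double);  /* and in bytes */
}
```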


7.5 Shift operators 

The shift operators << and >> group left-to-right. Both perform the usual arithmetic conversions on 
their operands, each of which must be integral. Then the right operand is converted to int; the type of 
the result is that of the left operand. The result is undefined if the right operand is negative, or greater 
than or equal to the length of the object in bits. 


shift-expression: 
expression << expression 
expression >> expression 


The value of E1<<E2 is E1 (interpreted as a bit pattern) left-shifted E2 bits; vacated bits are 0-filled. 
The value of E1>>E2 is E1 right-shifted E2 bit positions. The right shift is guaranteed to be logical
(0-fill) if E1 is unsigned; otherwise it may be arithmetic (fill by a copy of the sign bit).
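
The portable cases of shifting can be sketched as follows. The u suffix and UINT_MAX belong to present-day compilers rather than to this manual; shift_demo is a name invented for the example.

```c
#include <limits.h>

/* Hypothetical helper illustrating << and >>. */
int shift_demo(void)
{
    unsigned int all = ~0u;     /* every bit set */

    return (1 << 3) == 8                 /* vacated bits are 0-filled */
        && (all >> 1) == UINT_MAX / 2    /* logical shift: top bit becomes 0 */
        && (40 >> 2) == 10;              /* non-negative values shift alike */
}
```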


7.6 Relational operators 
The relational operators group left-to-right, but this fact is not very useful; a<b<c does not mean 
what it seems to. 


relational-expression: 
expression < expression 
expression > expression 
expression <= expression 
expression >= expression 


The operators < (less than), > (greater than), <= (less than or equal to) and >= (greater than or equal 
to) all yield 0 if the specified relation is false and 1 if it is true. The type of the result is int. The usual
arithmetic conversions are performed. Two pointers may be compared; the result depends on the relative 
locations in the address space of the pointed-to objects. Pointer comparison is portable only when the 
pointers point to objects in the same array. 
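
The remark about a<b<c can be made concrete. In this sketch (function names invented for the example) the first function shows what the chained form computes, and the second shows the test that was probably intended. Some compilers warn about the first form.

```c
/* Hypothetical helper: a<b<c groups as (a<b)<c. */
int chained_less(int a, int b, int c)
{
    return a < b < c;   /* compares the 0-or-1 result of a<b with c */
}

/* What is usually meant: a, b and c in strictly increasing order. */
int in_order(int a, int b, int c)
{
    return a < b && b < c;
}
```

With the arguments 3, 2, 1 the chained form yields 1, since 3<2 is 0 and 0<1 is true, while in_order yields 0.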


7.7 Equality operators 


equality-expression: 
expression == expression 
expression != expression


The == (equal to) and the != (not equal to) operators are exactly analogous to the relational operators
except for their lower precedence. (Thus a<b == c<d is 1 whenever a<b and c<d have the same
truth-value).

A pointer may be compared to an integer only if the integer is the constant 0. A pointer to which 0
has been assigned is guaranteed not to point to any object, and will appear to be equal to 0; in conven- 
tional usage, such a pointer is considered to be null. 


7.8 Bitwise AND operator 


and-expression: 
expression & expression 


The & operator is associative and expressions involving & may be rearranged. The usual arithmetic 
conversions are performed; the result is the bitwise AND function of the operands. The operator applies 
only to integral operands. 


7.9 Bitwise exclusive OR operator 


exclusive-or-expression: 
expression ^ expression


The ^ operator is associative and expressions involving ^ may be rearranged. The usual arithmetic
conversions are performed; the result is the bitwise exclusive OR function of the operands. The operator 
applies only to integral operands. 


7.10 Bitwise inclusive OR operator 


inclusive-or-expression:
expression | expression


The | operator is associative and expressions involving | may be rearranged. The usual arithmetic 
conversions are performed; the result is the bitwise inclusive OR function of its operands. The operator 
applies only to integral operands. 


7.11 Logical AND operator 


logical-and-expression: 
expression && expression 


The && operator groups left-to-right. It returns 1 if both its operands are non-zero, 0 otherwise. 
Unlike &, && guarantees left-to-right evaluation; moreover the second operand is not evaluated if the 
first operand is 0. 

The operands need not have the same type, but each must have one of the fundamental types or be 
a pointer. The result is always int. 


7.12 Logical OR operator 


logical-or-expression:
expression || expression


The || operator groups left-to-right. It returns 1 if either of its operands is non-zero, and 0 otherwise.
Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if
the value of the first operand is non-zero.

The operands need not have the same type, but each must have one of the fundamental types or be 
a pointer. The result is always int. 
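
The guaranteed left-to-right evaluation of && and || can be observed by counting how many operands are actually evaluated. This is a sketch; evaluated, probe, short_circuit_demo and is_positive are names invented for the example.

```c
static int evaluated;                    /* counts calls to probe() */

static int probe(int v) { evaluated++; return v; }

/* Hypothetical helper: packs the evaluation count and both results. */
int short_circuit_demo(void)
{
    int r1, r2;

    evaluated = 0;
    r1 = probe(0) && probe(1);   /* left operand is 0: right one skipped */
    r2 = probe(1) || probe(0);   /* left operand non-zero: right one skipped */
    return evaluated * 100 + r1 * 10 + r2;
}

/* Typical use: guard a dereference with a test against 0. */
int is_positive(int *p)
{
    return p != 0 && *p > 0;
}
```

Only two of the four calls to probe take place, so short_circuit_demo returns 201.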


7.13 Conditional operator 


conditional-expression: 
expression ? expression : expression 


Conditional expressions group right-to-left. The first expression is evaluated and if it is non-zero,
the result is the value of the second expression, otherwise that of the third expression. If possible, the usual
arithmetic conversions are performed to bring the second and third expressions to a common type; otherwise,
if both are pointers of the same type, the result has the common type; otherwise, one must be a
pointer and the other the constant 0, and the result has the type of the pointer. Only one of the second
and third expressions is evaluated.
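
Two small uses of the conditional operator are sketched below; the function names are invented for the example.

```c
/* Hypothetical helper: larger of two values. */
int max_of(int a, int b)
{
    return a > b ? a : b;
}

/* Only the selected operand is evaluated, so the division is safe. */
int safe_div(int num, int den)
{
    return den != 0 ? num / den : 0;
}
```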


7.14 Assignment operators

There are a number of assignment operators, all of which group right-to-left. All require an lvalue as 
their left operand, and the type of an assignment expression is that of its left operand. The value is the 
value stored in the left operand after the assignment has taken place. The two parts of a compound 
assignment operator are separate tokens. 


assignment-expression: 

lvalue = expression 
lvalue += expression 
lvalue -= expression 
lvalue *= expression 
lvalue /= expression 
lvalue %= expression 
lvalue >>= expression
lvalue <<= expression
lvalue &= expression
lvalue ^= expression
lvalue |= expression


In the simple assignment with =, the value of the expression replaces that of the object referred to
by the lvalue. If both operands have arithmetic type, the right operand is converted to the type of the
left preparatory to the assignment. Second, both operands may be structures or unions of the same type. 
Finally, if the left operand is a pointer, the right operand must in general be a pointer of the same type; 
however the constant 0 may be assigned to a pointer, and it is guaranteed that this value will produce a 
null pointer distinguishable from a pointer to any object. 

The behavior of an expression of the form E1 op= E2 may be inferred by taking it as equivalent to
E1 = E1 op (E2); however, E1 is evaluated only once. In += and -=, the left operand may be a
pointer, in which case the (integral) right operand is converted as explained in §7.4; all right operands 
and all non-pointer left operands must have arithmetic type. 
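
The single evaluation of E1 matters when E1 has side effects. The following is a minimal sketch; add_and_advance and third_element are hypothetical names, and prototypes are assumed for the sake of current compilers.

```c
/* a[(*i)++] is evaluated once: one element is updated and *i advances
   by exactly one. The value of the assignment is the value stored. */
int add_and_advance(int a[], int *i)
{
    return a[(*i)++] += 10;
}

/* With a pointer as left operand, += steps by elements, as for the
   addition described in §7.4. */
int third_element(int *p)
{
    p += 2;
    return *p;
}
```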


7.15 Comma operator 


comma-expression: 
expression , expression 


A pair of expressions separated by a comma is evaluated left-to-right and the value of the left 
expression is discarded. The type and value of the result are the type and value of the right operand. 
This operator groups left-to-right. In contexts where comma is given a special meaning, for example in 
lists of actual arguments to functions (§7.1) and lists of initializers (§8.6), the comma operator as 
described in this section can only appear in parentheses; for example, 


f(a, (t=3, t+2), c)


has three arguments, the second of which has the value 5. 
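
A sketch of a parenthesized comma expression used as a function argument follows. The functions second_of_three and comma_demo are hypothetical, and prototypes are an assumption made for current compilers.

```c
/* Returns its second argument; the others are ignored. */
static int second_of_three(int a, int b, int c)
{
    (void) a;
    (void) c;
    return b;
}

/* The parenthesized operand is a single argument: t is set to 3 and the
   value t+2 is passed. */
int comma_demo(void)
{
    int t;

    return second_of_three(1, (t = 3, t + 2), 9);
}
```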


8. DECLARATIONS 
Declarations are used to specify the interpretation which C gives to each identifier; they do not 
necessarily reserve storage associated with the identifier. Declarations have the form 


declaration: 

decl-specifiers declarator-list_opt ;


The declarators in the declarator-list contain the identifiers being declared. The decl-specifiers consist of a 
sequence of type and storage class specifiers. 


decl-specifiers: 
type-specifier decl-specifiers_opt
sc-specifier decl-specifiers_opt


The list must be self-consistent in a way described below.


8.1 Storage class specifiers
The sc-specifiers are:


sc-specifier:
auto 
static 
extern 
register 
typedef 


The typedef specifier does not reserve storage and is called a “storage class specifier” only for syntactic
convenience; it is discussed in §8.8. The meanings of the various storage classes were discussed in §4.

The auto, static, and register declarations also serve as definitions in that they cause an 
appropriate amount of storage to be reserved. In the extern case there must be an external definition 
(§10) for the given identifiers somewhere outside the function in which they are declared. 

A register declaration is best thought of as an auto declaration, together with a hint to the com- 
piler that the variables declared will be heavily used. Only the first few such declarations are effective. 
Moreover, only variables of certain types will be stored in registers; on the PDP-11, they are int or 
pointer. One other restriction applies to register variables: the address-of operator & cannot be applied to 
them. Smaller, faster programs can be expected if register declarations are used appropriately, but future 
improvements in code generation may render them unnecessary. 

At most one sc-specifier may be given in a declaration. If the sc-specifier is missing from a declara- 
tion, it is taken to be auto inside a function, extern outside. Exception: functions are never 
automatic. 
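
The storage classes can be contrasted in a short sketch. The names next_ticket and sum_to are hypothetical; prototypes are assumed so that the code compiles with a current compiler.

```c
/* A static variable is reserved once and keeps its value between calls;
   left uninitialized, it starts at 0 (§8.6). */
int next_ticket(void)
{
    static int ticket;

    return ++ticket;
}

/* register is only a hint, and &i may not be taken; auto is what a
   missing sc-specifier means inside a function. */
int sum_to(int n)
{
    register int i;
    auto int total = 0;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```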


8.2 Type specifiers 
The type-specifiers are 


type-specifier: 
char 
short 
int 
long 
unsigned 
float 
double 
void 
struct-or-union-specifier 
typedef-name 
enum-specifier 


The words long, short, and unsigned may be thought of as adjectives; the following combinations 
are acceptable. 


short int 
long int 
unsigned int 
unsigned char 
long float 


The meaning of the last is the same as double. Otherwise, at most one type-specifier may be given in a 
declaration. If the type-specifier is missing from a declaration, it is taken to be int. 

Specifiers for structures, unions and enumerations are discussed in §8.5; declarations with typedef 
names are discussed in §8.8. 


8.3 Declarators 
The declarator-list appearing in a declaration is a comma-separated sequence of declarators, each of 
which may have an initializer.


declarator-list:
init-declarator
init-declarator , declarator-list


init-declarator:
declarator initializer_opt


Initializers are discussed in §8.6. The specifiers in the declaration indicate the type and storage class of 
the objects to which the declarators refer. Declarators have the syntax: 


declarator:
identifier
( declarator )
* declarator
declarator ( )
declarator [ constant-expression_opt ]


The grouping is the same as in expressions. 


8.4 Meaning of declarators 

Each declarator is taken to be an assertion that when a construction of the same form as the declara- 
tor appears in an expression, it yields an object of the indicated type and storage class. Each declarator 
contains exactly one identifier; it is this identifier that is declared. 

If an unadorned identifier appears as a declarator, then it has the type indicated by the specifier head- 
ing the declaration. 

A declarator in parentheses is identical to the unadorned declarator, but the binding of complex 
declarators may be altered by parentheses. See the examples below. 

Now imagine a declaration 


T D1


where T is a type-specifier (like int, etc.) and D1 is a declarator. Suppose this declaration makes the
identifier have type “... T,” where the “...” is empty if D1 is just a plain identifier (so that the type of
x in “int x” is just int). Then if D1 has the form


*D 


the type of the contained identifier is “... pointer to T.”
If D1 has the form 


D() 


then the contained identifier has the type “... function returning T.”
If D1 has the form 


D[constant-expression]


or


D[]


then the contained identifier has type “... array of T.” In the first case the constant expression is an
expression whose value is determinable at compile time, and whose type is int. (Constant expressions
are defined precisely in §15.) When several “array of” specifications are adjacent, a multi-dimensional
array is created; the constant expressions which specify the bounds of the arrays may be missing only for 
the first member of the sequence. This elision is useful when the array is external and the actual 
definition, which allocates storage, is given elsewhere. The first constant-expression may also be omitted 
when the declarator is followed by initialization. In this case the size is calculated from the number of 
initial elements supplied. 

An array may be constructed from one of the basic types, from a pointer, from a structure or union, 
or from another array (to generate a multi-dimensional array). 

Not all the possibilities allowed by the syntax above are actually permitted. The restrictions are as 
follows: functions may not return arrays or functions, although they may return pointers to such things; 
there are no arrays of functions, although there may be arrays of pointers to functions. Likewise a structure
or union may not contain a function, but it may contain a pointer to a function.

As an example, the declaration


int i, *ip, f(), *fip(), (*pfi)();


declares an integer i, a pointer ip to an integer, a function f returning an integer, a function fip
returning a pointer to an integer, and a pointer pfi to a function which returns an integer. It is especially
useful to compare the last two. The binding of *fip() is *(fip()), so that the declaration suggests,
and the same construction in an expression requires, the calling of a function fip, and then using
indirection through the (pointer) result to yield an integer. In the declarator (*pfi)(), the extra
parentheses are necessary, as they are also in an expression, to indicate that indirection through a pointer
to a function yields a function, which is then called; it returns an integer.
As another example, 


float fa[17], *afp[17];


declares an array of float numbers and an array of pointers to float numbers. Finally,


static int x3d[3][5][7];


declares a static three-dimensional array of integers, with rank 3×5×7. In complete detail, x3d is an
array of three items; each item is an array of five arrays; each of the latter arrays is an array of seven
integers. Any of the expressions x3d, x3d[i], x3d[i][j], x3d[i][j][k] may reasonably appear
in an expression. The first three have type “array,” the last has type int.
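
The difference between a function returning a pointer and a pointer to a function can be exercised as follows. This is a sketch only: the bodies, the helper seven, and the functions use_fip, use_pfi and x3d_element_count are hypothetical, and the parameter lists are written (void) as an assumption for current compilers, where this manual writes empty parentheses.

```c
static int value = 42;

/* fip: function returning a pointer to an integer. */
int *fip(void)
{
    return &value;
}

static int seven(void)
{
    return 7;
}

/* pfi: pointer to a function which returns an integer. */
int (*pfi)(void) = seven;

int use_fip(void)
{
    return *fip();          /* call fip, then indirect through the result */
}

int use_pfi(void)
{
    return (*pfi)();        /* indirect through pfi, then call */
}

static int x3d[3][5][7];

/* Number of integers in x3d, computed from the sizes of the array types. */
int x3d_element_count(void)
{
    return (int) (sizeof x3d / sizeof x3d[0][0][0]);
}
```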


8.5 Structure, union and enumeration declarations 

A structure is an object consisting of a sequence of named members. Each member may have any 
type. A union is an object which may, at a given time, contain any one of several members. Structure 
and union specifiers have the same form. 


struct-or-union-specifier: 
struct-or-union { struct-decl-list } 
struct-or-union identifier { struct-decl-list } 
struct-or-union identifier 


struct-or-union: 
struct 
union 


The struct-decl-list is a sequence of declarations for the members of the structure or union: 


struct-decl-list: 
struct-declaration 
struct-declaration struct-decl-list 


struct-declaration: 
type-specifier struct-declarator-list ; 


struct-declarator-list: 
struct-declarator 
struct-declarator , struct-declarator-list 


In the usual case, a struct-declarator is just a declarator for a member of a structure or union. A struc- 
ture member may also consist of a specified number of bits. Such a member is also called a field; its 
length is set off from the field name by a colon. 


struct-declarator: 
declarator 
declarator : constant-expression
: constant-expression 


Within a structure, the objects declared have addresses which increase as their declarations are read left- 
to-right. Each non-field member of a structure begins on an addressing boundary appropriate to its type; 
therefore, there may be unnamed holes in a structure. Field members are packed into machine integers; 
they do not straddle words. A field which does not fit into the space remaining in a word is put into the 
next word. No field may be wider than a word. Fields are assigned right-to-left on the PDP-11 and
VAX-11, left-to-right on other machines.

A struct-declarator with no declarator, only a colon and a width, indicates an unnamed field useful 
for padding to conform to externally-imposed layouts. As a special case, an unnamed field with a width 
of 0 specifies alignment of the next field at a word boundary. The “next field” presumably is a field, not
an ordinary structure member, because in the latter case the alignment would have been automatic. 

The language does not restrict the types of things that are declared as fields, but implementations are 
not required to support any but integer fields. Moreover, even int fields may be considered to be 
unsigned. On the PDP-11, fields are not signed and have only integer values; on the VAX-11, fields 
declared with int are treated as containing a sign. For these reasons, it is strongly recommended that 
fields be declared as unsigned. In all implementations, there are no arrays of fields, and the address-of 
operator & may not be applied to them, so that there are no pointers to fields. 
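
A minimal sketch of fields declared unsigned, as recommended above, follows. The structure flags and the function store_mode are hypothetical; the widths are arbitrary, and the prototype is an assumption for current compilers.

```c
struct flags {
    unsigned ready : 1;
    unsigned mode  : 3;
    unsigned       : 0;     /* unnamed field of width 0: align the next field */
    unsigned count : 4;
};

/* Stores m in a 3-bit unsigned field and reads it back; only the low
   three bits survive. */
unsigned store_mode(unsigned m)
{
    struct flags f;

    f.ready = 1;
    f.mode = m;
    f.count = 0;
    return f.mode;
}
```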

A union may be thought of as a structure all of whose members begin at offset 0 and whose size is 
sufficient to contain any of its members. At most one of the members can be stored in a union at any 
time. 

A structure or union specifier of the second form, that is, one of 


struct identifier { struct-decl-list } 
union identifier { struct-decl-list } 


declares the identifier to be the structure tag (or union tag) of the structure specified by the list. A subse- 
quent declaration may then use the third form of specifier, one of 


struct identifier 
union identifier 


Structure tags allow definition of self-referential structures; they also permit the long part of the declaration
to be given once and used several times. It is illegal to declare a structure or union which contains
an instance of itself, but a structure or union may contain a pointer to an instance of itself.

The names of members and tags do not conflict with each other or with ordinary variables. A particular
name may not be used twice in the same structure, but the same name may be used in several
different structures in the same scope.

A simple example of a structure declaration is 


struct tnode {
    char tword[20];
    int count;
    struct tnode *left;
    struct tnode *right;
};


which contains an array of 20 characters, an integer, and two pointers to similar structures. Once this
declaration has been given, the declaration


struct tnode s, *sp;


declares s to be a structure of the given sort and sp to be a pointer to a structure of the given sort. 
With these declarations, the expression 


sp->count 

refers to the count field of the structure to which sp points; 
s.left 

refers to the left subtree pointer of the structure s; and 
s.right->tword[0]


refers to the first character of the tword member of the right subtree of s. 
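
The member references above can be collected into a small sketch. The function right_initial is hypothetical, and its prototype is an assumption made for current compilers.

```c
struct tnode {
    char tword[20];
    int count;
    struct tnode *left;
    struct tnode *right;
};

/* First character of the word in the right subtree of *sp, or 0 when
   there is no right subtree. */
char right_initial(struct tnode *sp)
{
    return sp->right ? sp->right->tword[0] : 0;
}
```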
Enumerations are unique types with named constants. However, the current language treats
enumeration variables and constants as being of int type.


enum-specifier:
enum { enum-list } 
enum identifier { enum-list } 
enum identifier 


enum-list: 
enumerator 
enum-list , enumerator 


enumerator: 
identifier 
identifier = constant-expression 


The identifiers in an enum-list are declared as constants, and may appear wherever constants are 
required. If no enumerators with = appear, then the values of the corresponding constants begin at 0 and 
increase by 1 as the declaration is read from left to right. An enumerator with = gives the associated 
identifier the value indicated; subsequent identifiers continue the progression from the assigned value. 

The names of enumerators in the same scope must all be distinct from each other and from those of
ordinary variables.

The role of the identifier in the enum-specifier is entirely analogous to that of the structure tag in a 
struct-specifier; it names a particular enumeration. For example, 


enum color { chartreuse, burgundy, claret=10, winedark }; 


enum color *cp, col; 


col = claret; 
cp = &col; 
if (*cp == burgundy) ...


makes color the enumeration-tag of a type describing various colors, and then declares cp as a pointer 
to an object of that type, and col as an object of that type. The possible values are drawn from the set 
{0,1,10, 11}. 
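
The progression of values can be checked directly. In this sketch the function is_burgundy is hypothetical, and its prototype is an assumption for current compilers.

```c
enum color { chartreuse, burgundy, claret = 10, winedark };

/* Compares the object to which cp points with an enumeration constant. */
int is_burgundy(enum color *cp)
{
    return *cp == burgundy;
}
```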


8.6 Initialization 
A declarator may specify an initial value for the identifier being declared. The initializer is preceded 
by =, and consists of an expression or a list of values nested in braces. 


initializer: 


expression 
{ initializer-list } 
{ initializer-list , } 


initializer-list: 
expression 
initializer-list , initializer-list 
{ initializer-list } 


All the expressions in an initializer for a static or external variable must be constant expressions, 
which are described in §15, or expressions which reduce to the address of a previously declared variable, 
possibly offset by a constant expression. Automatic or register variables may be initialized by arbitrary 
expressions involving constants, and previously declared variables and functions. 

Static and external variables which are not initialized are guaranteed to start off as 0; automatic and 
register variables which are not initialized are guaranteed to start off as garbage. 

When an initializer applies to a scalar (a pointer or an object of arithmetic type), it consists of a single
expression, perhaps in braces. The initial value of the object is taken from the expression; the same
conversions as for assignment are performed.

When the declared variable is an aggregate (a structure or array) then the initializer consists of a 
brace-enclosed, comma-separated list of initializers for the members of the aggregate, written in increas- 
ing subscript or member order. If the aggregate contains subaggregates, this rule applies recursively to 
the members of the aggregate. If there are fewer initializers in the list than there are members of the
aggregate, then the aggregate is padded with 0’s. It is not permitted to initialize unions or automatic
aggregates. 

Braces may be elided as follows. If the initializer begins with a left brace, then the succeeding 
comma-separated list of initializers initializes the members of the aggregate; it is erroneous for there to 
be more initializers than members. If, however, the initializer does not begin with a left brace, then only 
enough elements from the list are taken to account for the members of the aggregate; any remaining
initializers are left to initialize the next member of the aggregate of which the current aggregate is a part.

A final abbreviation allows a char array to be initialized by a string. In this case successive charac- 
ters of the string initialize the members of the array. 

For example, 


int x[] = { 1, 3, 5 };


declares and initializes x as a 1-dimensional array which has three members, since no size was specified 
and there are three initializers. 


float y[4][3] = {
    { 1, 3, 5 },
    { 2, 4, 6 },
    { 3, 5, 7 },
};


is a completely-bracketed initialization: 1, 3, and 5 initialize the first row of the array y[0], namely
y[0][0], y[0][1], and y[0][2]. Likewise the next two lines initialize y[1] and y[2]. The initializer
ends early and therefore y[3] is initialized with 0. Precisely the same effect could have been
achieved by 


float y[4][3] = {
    1, 3, 5, 2, 4, 6, 3, 5, 7
};


The initializer for y begins with a left brace, but that for y[0] does not, therefore 3 elements from the
list are used. Likewise the next three are taken successively for y[1] and y[2]. Also,


float y[4][3] = {
    { 1 }, { 2 }, { 3 }, { 4 }
};


initializes the first column of y (regarded as a two-dimensional array) and leaves the rest 0. 
Finally, 
char msg[] = "Syntax error on line %s\n";


shows a character array whose members are initialized with a string. 
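
The equivalence of the bracketed and unbracketed forms can be verified with a sketch such as the following. The array names z and col and the checking functions are hypothetical, and prototypes are assumed for current compilers.

```c
static int x[] = { 1, 3, 5 };

static float y[4][3] = {
    { 1, 3, 5 },
    { 2, 4, 6 },
    { 3, 5, 7 },
};

static float z[4][3] = { 1, 3, 5, 2, 4, 6, 3, 5, 7 };  /* inner braces elided */

static float col[4][3] = { { 1 }, { 2 }, { 3 }, { 4 } };

/* Number of members of x, taken from the number of initializers. */
int x_count(void)
{
    return (int) (sizeof x / sizeof x[0]);
}

/* 1 if y and z hold the same values throughout, 0 otherwise. */
int rows_match(void)
{
    int i, j;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 3; j++)
            if (y[i][j] != z[i][j])
                return 0;
    return 1;
}

/* 1 if only the first column of col is non-zero and holds 1, 2, 3, 4. */
int first_column_only(void)
{
    int i;

    for (i = 0; i < 4; i++)
        if (col[i][0] != i + 1 || col[i][1] != 0 || col[i][2] != 0)
            return 0;
    return 1;
}

/* 1 if the row without initializers was padded with 0. */
int last_row_is_zero(void)
{
    return y[3][0] == 0 && y[3][1] == 0 && y[3][2] == 0;
}
```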


8.7 Type names 

In two contexts (to specify type conversions explicitly by means of a cast, and as an argument of
sizeof) it is desired to supply the name of a data type. This is accomplished using a “type name,”
which in essence is a declaration for an object of that type which omits the name of the object.


type-name: 
type-specifier abstract-declarator 


abstract-declarator: 
empty 
( abstract-declarator ) 
* abstract-declarator 
abstract-declarator ( )
abstract-declarator [ constant-expression_opt ]


To avoid ambiguity, in the construction 
( abstract-declarator ) 


the abstract-declarator is required to be non-empty. Under this restriction, it is possible to identify
uniquely the location in the abstract-declarator where the identifier would appear if the construction were
a declarator in a declaration. The named type is then the same as the type of the hypothetical identifier.
For example,


int
int *
int *[3]
int (*)[3]
int *()
int (*)()


name respectively the types “integer,” “pointer to integer,” “array of 3 pointers to integers,” “pointer
to an array of 3 integers,” “function returning pointer to integer,” and “pointer to function returning an
integer.”
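
Type names in their two contexts, sizeof and the cast, may be sketched as follows. The function names are hypothetical, and prototypes are an assumption for current compilers.

```c
/* 1 if the sizes named by the type names agree with those of objects
   declared with the corresponding declarators. */
int type_name_sizes_agree(void)
{
    int a[3];
    int (*pa)[3] = &a;              /* pointer to an array of 3 integers */

    return sizeof(int (*)[3]) == sizeof pa
        && sizeof(int *[3]) == 3 * sizeof(int *);
}

/* A cast names the type to which the value is converted. */
int truncate_to_int(double d)
{
    return (int) d;
}
```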


8.8 Typedef 
Declarations whose “storage class” is typedef do not define storage, but instead define identifiers
which can be used later as if they were type keywords naming fundamental or derived types. 


typedef-name: 
identifier 


Within the scope of a declaration involving typedef, each identifier appearing as part of any declarator 
therein becomes syntactically equivalent to the type keyword naming the type associated with the
identifier in the way described in §8.4. For example, after 


typedef int MILES, *KLICKSP; 
typedef struct { double re, im; } complex;


the constructions 


MILES distance; 
extern KLICKSP metricp; 
complex z, *zp;


are all legal declarations; the type of distance is int, that of metricp is “pointer to int,” and that
of z is the specified structure. zp is a pointer to such a structure. 

typedef does not introduce brand new types, only synonyms for types which could be specified in 
another way. Thus in the example above distance is considered to have exactly the same type as any 
other int object. 
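
That a typedef name is only a synonym can be seen in a sketch like the one below. The initial values and the functions add_plain_int, miles_is_int and norm_squared are hypothetical; prototypes are assumed for current compilers.

```c
typedef int MILES, *KLICKSP;
typedef struct { double re, im; } complex;

MILES distance = 26;
KLICKSP metricp = &distance;

/* Accepts any int; a MILES object qualifies without conversion. */
int add_plain_int(int n)
{
    return n + 1;
}

int miles_is_int(void)
{
    return add_plain_int(distance) == 27 && *metricp == 26;
}

/* zp is a pointer to the structure named by complex. */
double norm_squared(complex *zp)
{
    return zp->re * zp->re + zp->im * zp->im;
}
```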


9. STATEMENTS 
Except as indicated, statements are executed in sequence. 


9.1 Expression statement 
Most statements are expression statements, which have the form 


expression ; 


Usually expression statements are assignments or function calls. 


9.2 Compound statement, or block 
So that several statements can be used where one is expected, the compound statement (also, and 
equivalently, called ‘‘block’’) is provided: 


compound-statement:
{ declaration-list_opt statement-list_opt }


declaration-list:
declaration
declaration declaration-list


statement-list:
statement
statement statement-list


If any of the identifiers in the declaration-list were previously declared, the outer declaration is pushed
down for the duration of the block, after which it resumes its force. 

Any initializations of auto or register variables are performed each time the block is entered at 
the top. It is currently possible (but a bad practice) to transfer into a block; in that case the initializations 
are not performed. Initializations of static variables are performed only once when the program 
begins execution. Inside a block, extern declarations do not reserve storage so initialization is not per- 
mitted. 


9.3 Conditional statement 
The two forms of the conditional statement are 


if ( expression ) statement 
if ( expression ) statement else statement 


In both cases the expression is evaluated and if it is non-zero, the first substatement is executed. In the 
second case the second substatement is executed if the expression is 0. As usual the “else” ambiguity is
resolved by connecting an else with the last encountered else-less if. 


9.4 While statement 
The while statement has the form 


while ( expression ) statement 


The substatement is executed repeatedly so long as the value of the expression remains non-zero. The 
test takes place before each execution of the statement. 


9.5 Do statement 
The do statement has the form 


do statement while ( expression ) ;


The substatement is executed repeatedly until the value of the expression becomes zero. The test takes 
place after each execution of the statement. 


9.6 For statement 
The for statement has the form 


for ( expression-1_opt ; expression-2_opt ; expression-3_opt ) statement


This statement is equivalent to


expression-1 ;
while ( expression-2 ) {
    statement
    expression-3 ;
}


Thus the first expression specifies initialization for the loop; the second specifies a test, made before each 
iteration, such that the loop is exited when the expression becomes 0; the third expression often specifies 
an incrementing that is performed after each iteration.

Any or all of the expressions may be dropped. A missing expression-2 makes the implied while 
clause equivalent to while(1); other missing expressions are simply dropped from the expansion 
above. 
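
The expansion of for into while can be written out for a concrete loop. The functions sum_for and sum_while are hypothetical, and prototypes are assumed for current compilers.

```c
/* Sum of the integers 0 .. n-1, written with for. */
int sum_for(int n)
{
    int i, s = 0;

    for (i = 0; i < n; i++)
        s += i;
    return s;
}

/* The same loop, expanded as described above. */
int sum_while(int n)
{
    int i, s = 0;

    i = 0;                  /* expression-1 */
    while (i < n) {         /* expression-2 */
        s += i;             /* statement */
        i++;                /* expression-3 */
    }
    return s;
}
```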


9.7 Switch statement 
The switch statement causes control to be transferred to one of several statements depending on 
the value of an expression. It has the form 


switch ( expression ) statement 


The usual arithmetic conversion is performed on the expression, but the result must be int. The statement
is typically compound. Any statement within the statement may be labeled with one or more case
prefixes as follows:


case constant-expression :


where the constant expression must be int. No two of the case constants in the same switch may have
the same value. Constant expressions are precisely defined in §15. 
There may also be at most one statement prefix of the form 


default :


When the switch statement is executed, its expression is evaluated and compared with each case con- 
stant. If one of the case constants is equal to the value of the expression, control is passed to the state- 
ment following the matched case prefix. If no case constant matches the expression, and if there is a 
default prefix, control passes to the prefixed statement. If no case matches and if there is no 
default then none of the statements in the switch is executed. 

case and default prefixes in themselves do not alter the flow of control, which continues unim- 
peded across such prefixes. To exit from a switch, see break, §9.8. 

Usually the statement that is the subject of a switch is compound. Declarations may appear at the 
head of this statement, but initializations of automatic or register variables are ineffective. 
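
Because case prefixes do not alter the flow of control, execution continues from one case into the next unless a break intervenes. A minimal sketch, in which the function classify and its values are hypothetical and the prototype is an assumption for current compilers:

```c
int classify(int c)
{
    int score = 0;

    switch (c) {
    case 1:
        score += 1;         /* no break: control continues into case 2 */
    case 2:
        score += 10;
        break;
    default:
        score = -1;
    }
    return score;
}
```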


9.8 Break statement
The statement


break ;


causes termination of the smallest enclosing while, do, for, or switch statement; control passes to
the statement following the terminated statement.


9.9 Continue statement
The statement


continue ;


causes control to pass to the loop-continuation portion of the smallest enclosing while, do, or for 
statement; that is to the end of the loop. More precisely, in each of the statements 


while (...) {       do {                for (...) {
    ...                 ...                 ...
contin: ;           contin: ;           contin: ;
}                   } while (...);      }


a continue is equivalent to goto contin. (Following the contin: is a null statement, §9.13.)
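
The equivalence can be tried on a for loop. The functions sum_odd and sum_odd_goto are hypothetical, and prototypes are assumed for current compilers.

```c
/* Sum of the odd integers 1 .. n, skipping even values with continue. */
int sum_odd(int n)
{
    int i, s = 0;

    for (i = 1; i <= n; i++) {
        if (i % 2 == 0)
            continue;
        s += i;
    }
    return s;
}

/* The same loop with continue replaced by goto contin. */
int sum_odd_goto(int n)
{
    int i, s = 0;

    for (i = 1; i <= n; i++) {
        if (i % 2 == 0)
            goto contin;
        s += i;
    contin: ;
    }
    return s;
}
```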


9.10 Return statement
A function returns to its caller by means of the return statement, which has one of the forms 


return ; 
return expression ; 


In the first case the returned value is undefined. In the second case, the value of the expression is 
returned to the caller of the function. If required, the expression is converted, as if by assignment, to the 
type of the function in which it appears. Flowing off the end of a function is equivalent to a return with 
no returned value. 


9.11 Goto statement
Control may be transferred unconditionally by means of the statement


goto identifier ;


The identifier must be a label (§9.12) located in the current function.


9.12 Labeled statement
Any statement may be preceded by label prefixes of the form


identifier :


which serve to declare the identifier as a label. The only use of a label is as a target of a goto. The 
scope of a label is the current function, excluding any sub-blocks in which the same identifier has been 
redeclared. See §11.


9.13 Null statement
The null statement has the form


;


A null statement is useful to carry a label just before the } of a compound statement or to supply a null 
body to a looping statement such as while. 


10. EXTERNAL DEFINITIONS 

A C program consists of a sequence of external definitions. An external definition declares an 
identifier to have storage class extern (by default) or perhaps static, and a specified type. The 
type-specifier (§8.2) may also be empty, in which case the type is taken to be int. The scope of external
definitions persists to the end of the file in which they are declared just as the effect of declarations per- 
sists to the end of a block. The syntax of external definitions is the same as that of all declarations, 
except that only at this level may the code for functions be given. 


10.1 External function definitions 
Function definitions have the form 


function-definition: 
decl-specifiers_opt function-declarator function-body


The only sc-specifiers allowed among the decl-specifiers are extern or static; see §11.2 for the distinction
between them. A function declarator is similar to a declarator for a “function returning ...”
except that it lists the formal parameters of the function being defined. 


function-declarator:
declarator ( parameter-list_opt )


parameter-list:
identifier
identifier , parameter-list


The function-body has the form 


function-body: 


declaration-list compound-statement 


The identifiers in the parameter list, and only those identifiers, may be declared in the declaration list. 
Any identifiers whose type is not given are taken to be int. The only storage class which may be 
specified is register; if it is specified, the corresponding actual parameter will be copied, if possible, 
into a register at the outset of the function. 

A simple example of a complete function definition is 


int max(a, b, c)
int a, b, c;
{
    int m;

    m = (a > b) ? a : b;
    return((m > c) ? m : c);
}


Here int is the type-specifier; max(a, b, c) is the function-declarator; int a, b, c; is the 
declaration-list for the formal parameters; { ... } is the block giving the code for the statement. 

C converts all float actual parameters to double, so formal parameters declared float have 
their declaration adjusted to read double. Also, since a reference to an array in any context (in particular
as an actual parameter) is taken to mean a pointer to the first element of the array, declarations of
formal parameters declared “array of ...” are adjusted to read “pointer to ...”.
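
For comparison, the function max and a function with an array parameter are sketched below with the parameter types written inside the parentheses. That prototype form is not part of the language described in this manual; it is assumed here only because some current compilers no longer accept a separate declaration-list. The function total is hypothetical.

```c
/* Same body as the max of this section; only the parameter
   declarations are written differently. */
int max(int a, int b, int c)
{
    int m;

    m = (a > b) ? a : b;
    return (m > c) ? m : c;
}

/* v is declared "array of int" but is adjusted to "pointer to int";
   the caller's array is not copied. */
int total(int v[], int n)
{
    int i, sum = 0;

    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```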


10.2 External data definitions
An external data definition has the form 


data-definition: 
declaration 


The storage class of such data may be extern (which is the default) or static, but not auto or 
register. 


11. SCOPE RULES 

A C program need not all be compiled at the same time: the source text of the program may be kept 
in several files, and precompiled routines may be loaded from libraries. Communication among the func- 
tions of a program may be carried out both through explicit calls and through manipulation of external 
data. 

Therefore, there are two kinds of scope to consider: first, what may be called the lexical scope of an 
identifier, which is essentially the region of a program during which it may be used without drawing
“undefined identifier” diagnostics; and second, the scope associated with external identifiers, which is
characterized by the rule that references to the same external identifier are references to the same object. 


11.1 Lexical scope 

The lexical scope of identifiers declared in external definitions persists from the definition through 
the end of the source file in which they appear. The lexical scope of identifiers which are formal parame- 
ters persists through the function with which they are associated. The lexical scope of identifiers declared 
at the head of a block persists until the end of the block. The lexical scope of labels is the whole of the 
function in which they appear. 

In all cases, however, if an identifier is explicitly declared at the head of a block, including the block 
constituting a function, any declaration of that identifier outside the block is suspended until the end of 
the block. 
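
The suspension of an outer declaration can be observed in a short sketch. The identifier level and the function inner_level are hypothetical, and the prototype is an assumption for current compilers.

```c
int level = 1;              /* external definition */

int inner_level(void)
{
    int seen;

    {
        int level = 2;      /* suspends the outer level within this block */

        seen = level;
    }
    return seen * 10 + level;   /* the outer level is in force again */
}
```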

Remember also (§8.5) that identifiers associated with ordinary variables on the one hand and those 
associated with structure and union members and tags on the other form two disjoint classes which do 
not conflict. Members and tags follow the same scope rules as other identifiers. typedef names are in 
the same class as ordinary identifiers. They may be redeclared in inner blocks, but an explicit type must 
be given in the inner declaration: 


typedef float distance;
...
{
        auto int distance;
        ...


The int must be present in the second declaration, or it would be taken to be a declaration with no 
declarators and type distance*. 


11.2 Scope of externals 

If a function refers to an identifier declared to be extern, then somewhere among the files or
libraries constituting the complete program there must be an external definition for the identifier. All
functions in a given program which refer to the same external identifier refer to the same object, so care
must be taken that the type and size specified in the definition are compatible with those specified by each
function which references the data.

The appearance of the extern keyword in an external definition indicates that storage for the 
identifiers being declared will be allocated in another file. Thus in a multi-file program, an external data 
definition without the extern specifier must appear in exactly one of the files. Any other files which 
wish to give an external definition for the identifier must include the extern in the definition. The 
identifier can be initialized only in the declaration where storage is allocated. 

Identifiers declared static at the top level in external definitions are not visible in other files.
Functions may be declared static.
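
A single-file sketch of these rules, written in present-day C syntax. The names `counter`, `step`, `next_step`, and `bump` are hypothetical; in a real multi-file program the extern declaration would typically sit in a different file from the definition.

```c
extern int counter;         /* with extern: storage is allocated by a definition elsewhere */
int counter = 10;           /* the one definition without extern; only it may be initialized */

static int step = 3;        /* static at the top level: not visible in other files */

static int next_step(void)  /* a function may be declared static as well */
{
    return step;
}

int bump(void)
{
    counter += next_step(); /* every reference to counter is to the same object */
    return counter;
}
```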


* It is agreed that the ice is thin here. 


12. COMPILER CONTROL LINES

The C compiler contains a preprocessor capable of macro substitution, conditional compilation, and 
inclusion of named files. Lines beginning with # communicate with this preprocessor. These lines have 
syntax independent of the rest of the language; they may appear anywhere and have effect which lasts 
(independent of scope) until the end of the source program file. 


12.1 Token replacement 
A compiler-control line of the form 


#define identifier token-string 


causes the preprocessor to replace subsequent instances of the identifier with the given string of tokens. 
Semicolons in, or at the end of, the token-string are part of that string. A line of the form 


#define identifier( identifier , ... , identifier ) token-string 


where there is no space between the first identifier and the (, is a macro definition with arguments. Sub- 
sequent instances of the first identifier followed by a (, a sequence of tokens delimited by commas, and a 
) are replaced by the token string in the definition. Each occurrence of an identifier mentioned in the 
formal parameter list of the definition is replaced by the corresponding token string from the call. The 
actual arguments in the call are token strings separated by commas; however, commas in quoted strings or
protected by parentheses do not separate arguments. The number of formal and actual parameters must
be the same. Strings and character constants in the token-string are scanned for formal parameters, but
strings and character constants in the rest of the program are not scanned for defined identifiers to
replace.

In both forms the replacement string is rescanned for more defined identifiers. In both forms a long 
definition may be continued on another line by writing \ at the end of the line to be continued. 

This facility is most valuable for definition of ‘‘manifest constants,’’ as in 


#define TABSIZE 100
int table[TABSIZE];


A control line of the form


#undef identifier


causes the identifier’s preprocessor definition to be forgotten.
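
As an illustration of both forms of #define and of #undef, here is a minimal sketch in present-day C. `MAX`, `add`, `macro_demo`, and `table_len` are hypothetical names chosen for the example.

```c
#define TABSIZE 100                          /* manifest constant */
#define MAX(a, b) ((a) > (b) ? (a) : (b))    /* no space between MAX and ( */

int table[TABSIZE];

static int add(int x, int y) { return x + y; }

int macro_demo(void)
{
    /* The comma inside add(1, 4) is protected by parentheses,
       so MAX still receives exactly two actual arguments. */
    return MAX(add(1, 4), 3);
}

int table_len(void)
{
    return (int) (sizeof table / sizeof table[0]);
}

#undef MAX                                   /* the definition of MAX is forgotten from here on */
```

The extra parentheses around `a`, `b`, and the whole token-string are a common precaution, since replacement is purely textual.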


12.2 File inclusion
A compiler control line of the form


#include "filename"


causes the replacement of that line by the entire contents of the file filename. The named file is searched 
for first in the directory of the original source file, and then in a sequence of specified or standard places. 
Alternatively, a control line of the form 


#include <filename>


searches only the specified or standard places, and not the directory of the source file. (How the places 
are specified is not part of the language.) 
#include’s may be nested. 


12.3 Conditional compilation 
A compiler control line of the form


#if constant-expression


checks whether the constant expression evaluates to non-zero. (Constant expressions are discussed in
§15; the following additional restriction applies here: the constant expression may not contain sizeof or
an enumeration constant.) A control line of the form


#ifdef identifier 


checks whether the identifier is currently defined in the preprocessor; that is, whether it has been the 
subject of a #define control line. A control line of the form 


#ifndef identifier


checks whether the identifier is currently undefined in the preprocessor. 

All three forms are followed by an arbitrary number of lines, possibly containing a control line


#else


and then by a control line


#endif


If the checked condition is true then any lines between #else and #endif are ignored. If the checked
condition is false then any lines between the test and an #else or, lacking an #else, the #endif, are
ignored.

These constructions may be nested. 
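
The three tests can be sketched together as follows. `DEBUG`, `BUFSIZE`, `TRACE_LEVEL`, `BLOCKS`, and the two functions are hypothetical names; a symbol such as `DEBUG` would often be supplied from outside the source file rather than defined in it.

```c
#define DEBUG                 /* hypothetical switch */
#define BUFSIZE 512

#ifdef DEBUG                  /* true: DEBUG has been the subject of a #define */
#define TRACE_LEVEL 2
#else
#define TRACE_LEVEL 0         /* ignored */
#endif

#ifndef BUFSIZE               /* false: BUFSIZE is already defined */
#define BUFSIZE 128           /* ignored */
#endif

#if BUFSIZE > 256             /* constant expression without sizeof or enumeration constants */
#define BLOCKS 1
#else
#define BLOCKS 4              /* ignored */
#endif

int trace_level(void) { return TRACE_LEVEL; }
int blocks(void)      { return BLOCKS; }
```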


12.4 Line control 
For the benefit of other preprocessors which generate C programs, a line of the form 


#line constant "filename"


causes the compiler to believe, for purposes of error diagnostics, that the line number of the next source
line is given by the constant and the current input file is named by filename. If the file name is
absent the remembered file name does not change.
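
A small sketch of the effect. It relies on the predefined macro `__LINE__`, which is an assumption here: it belongs to later standard C and is not described in this manual. The file name `grammar.y` and the function `line_demo` are hypothetical.

```c
int line_demo(void)
{
#line 100 "grammar.y"
    return __LINE__;    /* the compiler now numbers this source line 100 */
}
```

Any diagnostic issued for the lines that follow would be reported against `grammar.y`, counting on from line 100.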


13. IMPLICIT DECLARATIONS 

It is not always necessary to specify both the storage class and the type of identifiers in a declaration. 
The storage class is supplied by the context in external definitions and in declarations of formal parame- 
ters and structure members. In a declaration inside a function, if a storage class but no type is given, the 
identifier is assumed to be int; if a type but no storage class is indicated, the identifier is assumed to be 
auto. An exception to the latter rule is made for functions, because auto functions do not exist. If 
the type of an identifier is “function returning ...”, it is implicitly declared to be extern.

In an expression, an identifier followed by ( and not already declared is contextually declared to be 
“function returning int”.


14. TYPES REVISITED 
This section summarizes the operations which can be performed on objects of certain types. 


14.1 Structures and unions 
Structures and unions may be assigned, passed as arguments to functions, and returned by functions. 
Other plausible operators, such as equality comparison and structure casts, are not implemented. 

In a reference to a structure or union member, the name on the right must specify a member of the 
aggregate named or pointed to by the expression on the left. In general, a member of a union may not 
be inspected unless the value of the union has been assigned using that same member. However, one 
special guarantee is made by the language in order to simplify the use of unions: if a union contains 
several structures that share a common initial sequence, and if the union currently contains one of these 
structures, it is permitted to inspect the common initial part of any of the contained structures. For 
example, the following is a legal fragment: 




union {
        struct {
                int type;
        } n;
        struct {
                int type;
                int intnode;
        } ni;
        struct {
                int type;
                float floatnode;
        } nf;
} u;
...
u.nf.type = FLOAT;
u.nf.floatnode = 3.14;
...
if (u.n.type == FLOAT)
        ... sin(u.nf.floatnode) ...
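
A compilable variant of the same idea, as a sketch in present-day C. The tag values `INT` and `FLOAT`, the union tag `node`, and the function `node_value` are assumptions made for the example; the manual leaves `FLOAT` undefined.

```c
enum { INT = 1, FLOAT = 2 };            /* hypothetical type codes */

union node {
    struct { int type; } n;
    struct { int type; int intnode; } ni;
    struct { int type; float floatnode; } nf;
};

float node_value(union node *u)
{
    /* The common initial member type may be inspected through n,
       whichever of the contained structures was stored last. */
    if (u->n.type == FLOAT)
        return u->nf.floatnode;
    return (float) u->ni.intnode;
}
```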


14.2 Functions 

There are only two things that can be done with a function: call it, or take its address. If the name 
of a function appears in an expression not in the function-name position of a call, a pointer to the func- 
tion is generated. Thus, to pass one function to another, one might say 


int f();
...
g(f);


Then the definition of g might read


g(funcp)
int (*funcp)();
{
        ...
        (*funcp)();
        ...
}


Notice that f must be declared explicitly in the calling routine since its appearance in g(f) was not
followed by (.
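
The same pattern in present-day C, where parameter types are written in the declarator. This is a sketch only; `twice`, `negate`, and `apply` are hypothetical names.

```c
int twice(int x)  { return 2 * x; }
int negate(int x) { return -x; }

/* funcp is a pointer to a function taking an int and returning int */
int apply(int (*funcp)(int), int arg)
{
    return (*funcp)(arg);       /* call through the pointer */
}
```

A call such as `apply(twice, 21)` passes a pointer to `twice`, because the function name appears outside the function-name position of a call.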


14.3 Arrays, pointers, and subscripting

Every time an identifier of array type appears in an expression, it is converted into a pointer to the 
first member of the array. Because of this conversion, arrays are not lvalues. By definition, the subscript 
operator [ ] is interpreted in such a way that E1[E2] is identical to *((E1)+(E2)). Because of the 
conversion rules which apply to +, if E1 is an array and E2 an integer, then E1[E2] refers to the E2-th 
member of E1. Therefore, despite its asymmetric appearance, subscripting is a commutative operation. 
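
The identity and its commutative consequence can be checked directly. A minimal sketch; `subscript_demo` is a hypothetical name.

```c
int subscript_demo(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int i = 2;

    /* a[i], *(a + i), and i[a] all name the same element */
    return a[i] == *(a + i) && a[i] == i[a] && i[a] == 30;
}
```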

A consistent rule is followed in the case of multi-dimensional arrays. If E is an n-dimensional array
of rank i×j×...×k, then E appearing in an expression is converted to a pointer to an (n-1)-dimensional
array with rank j×...×k. If the * operator, either explicitly or implicitly as a result of subscripting, is
applied to this pointer, the result is the pointed-to (n-1)-dimensional array, which itself is immediately
converted into a pointer.

For example, consider 


int x[3][5];


Here x is a 3×5 array of integers. When x appears in an expression, it is converted to a pointer to (the
first of three) 5-membered arrays of integers. In the expression x[i], which is equivalent to *(x+i),
x is first converted to a pointer as described; then i is converted to the type of x, which involves
multiplying i by the length of the object to which the pointer points, namely 5 integer objects. The results are
added and indirection applied to yield an array (of 5 integers) which in turn is converted to a pointer to
the first of the integers. If there is another subscript the same argument applies again; this time the
result is an integer.

It follows from all this that arrays in C are stored row-wise (last subscript varies fastest) and that the 
first subscript in the declaration helps determine the amount of storage consumed by an array but plays 
no other part in subscript calculations. 
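
Row-wise storage means that x[i][j] lies i*5 + j integers past the start of x. The sketch below measures that distance in bytes and converts it to a count of integers; `element_offset` is a hypothetical helper, and `size_t` comes from the later standard header `<stddef.h>`.

```c
#include <stddef.h>

static int x[3][5];

/* Number of int objects between the start of x and x[i][j]. */
size_t element_offset(int i, int j)
{
    return (size_t) ((char *) &x[i][j] - (char *) x) / sizeof(int);
}
```

Only the second dimension (5) enters the calculation; the first dimension (3) affects nothing but the total storage.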


14.4 Explicit pointer conversions 

Certain conversions involving pointers are permitted but have implementation-dependent aspects. 
They are all specified by means of an explicit type-conversion operator, §§7.2 and 8.7. 

A pointer may be converted to any of the integral types large enough to hold it. Whether an int or 
long is required is machine dependent. The mapping function is also machine dependent, but is 
intended to be unsurprising to those who know the addressing structure of the machine. Details for 
some particular machines are given below. 

An object of integral type may be explicitly converted to a pointer. The mapping always carries an 
integer converted from a pointer back to the same pointer, but is otherwise machine dependent. 

A pointer to one type may be converted to a pointer to another type. The resulting pointer may 
cause addressing exceptions upon use if the subject pointer does not refer to an object suitably aligned in 
storage. It is guaranteed that a pointer to an object of a given size may be converted to a pointer to an 
object of a smaller size and back again without change. 

For example, a storage-allocation routine might accept a size (in bytes) of an object to allocate, and 
return a char pointer; it might be used in this way. 


extern char *alloc();
double *dp;


dp = (double *) alloc(sizeof(double));
*dp = 22.0 / 7.0;


alloc must ensure (in a machine-dependent way) that its return value is suitable for conversion to a 
pointer to double; then the use of the function is portable. 
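
On a present-day system the same usage can be sketched with the standard library function `malloc` standing in for the manual's `alloc`. That substitution is an assumption, as is the helper name `new_double`.

```c
#include <stdlib.h>

double *new_double(double value)
{
    /* malloc returns storage suitably aligned for any object type,
       so the explicit conversion to a double pointer is portable */
    double *dp = (double *) malloc(sizeof(double));

    if (dp != NULL)
        *dp = value;
    return dp;                  /* the caller releases it with free */
}
```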

The pointer representation on the PDP-11 corresponds to a 16-bit integer and measures bytes. 
chars have no alignment requirements; everything else must have an even address. 

On the VAX-11, pointers are 32 bits long and measure bytes. Elementary objects are aligned on a 
boundary equal to their length, except that double quantities need be aligned only on even 4-byte
boundaries. Aggregates are aligned on the strictest boundary required by any of their constituents. 

On the Honeywell 6000, a pointer corresponds to a 36-bit integer; the word part is in the left 18 bits,
and the two bits that select the character in a word lie just to their right. Thus char pointers measure
units of 2^16 bytes; everything else is measured in units of 2^18 machine words. double quantities and
aggregates containing them must lie on an even word address (0 mod 2^19).

The IBM 370 and the Interdata 8/32 are similar. On each, pointers are 32-bit quantities that measure 
bytes; elementary objects are aligned on a boundary equal to their length, so pointers to short must be 
0 mod 2, to int and float 0 mod 4, and to double 0 mod 8. Aggregates are aligned on the strictest 
boundary required by any of their constituents. 


15. CONSTANT EXPRESSIONS 

In several places C requires expressions which evaluate to a constant: after case, as array bounds, 
and in initializers. In the first two cases, the expression can involve only integer constants, character con- 
stants, enumeration constants, and sizeof expressions, possibly connected by the binary operators 


+ - * / % & | ^ << >> == != < > <= >=


or by the unary operators


- ~


or by the ternary operator


?:


Parentheses can be used for grouping, but not for function calls. 

More latitude is permitted for initializers; besides constant expressions as discussed above, one can 
also apply the unary & operator to external or static objects, and to external or static arrays subscripted 
with a constant expression. The unary & can also be applied implicitly by appearance of unsubscripted 
arrays and functions. The basic rule is that initializers must evaluate either to a constant or to the
address of a previously declared external or static object plus or minus a constant. 
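
The permitted initializer forms can be sketched as follows, in present-day C. All names here (`table`, `third`, `start`, `last`, `twice`, `op`, and the three accessor functions) are hypothetical.

```c
static int table[10];

static int *third = &table[2];        /* & applied to a static array with a constant subscript */
static int *start = table;            /* unsubscripted array: & is applied implicitly */
static int *last  = table + 10 - 1;   /* address of a static object plus or minus a constant */

static int twice(int x) { return 2 * x; }
static int (*op)(int) = twice;        /* function name: & is applied implicitly */

int third_index(void) { return (int) (third - start); }
int last_index(void)  { return (int) (last - start); }
int call_op(int x)    { return (*op)(x); }
```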

Less latitude is allowed for constant expressions after #if; sizeof expressions and enumeration 
constants are not permitted. 


16. PORTABILITY CONSIDERATIONS 

Certain parts of C are inherently machine dependent. The following list of potential trouble spots is 
not meant to be all-inclusive, but to point out the main ones. 

Purely hardware issues like word size and the properties of floating point arithmetic and integer divi- 
sion have proven in practice to be not much of a problem. Other facets of the hardware are reflected in 
differing implementations. Some of these, particularly sign extension (converting a negative character 
into a negative integer) and the order in which bytes are placed in a word, are a nuisance that must be 
carefully watched. Most of the others are only minor problems. 

The number of register variables that can actually be placed in registers varies from machine to 
machine, as does the set of valid types. Nonetheless, the compilers all do things properly for their own 
machine; excess or invalid register declarations are ignored. 

Some difficulties arise only when dubious coding practices are used. It is exceedingly unwise to write 
programs that depend on any of these properties. 

The order of evaluation of function arguments is not specified by the language. It is right to left on 
the PDP-11 and VAX-11, left to right on the others. The order in which side effects take place is also 
unspecified. 

Since character constants are really objects of type int, multi-character character constants may be 
permitted. The specific implementation is very machine dependent, however, because the order in which 
characters are assigned to a word varies from one machine to another. 

Fields are assigned to words and characters to integers right-to-left on the PDP-11 and VAX-11 and 
left-to-right on other machines. These differences are invisible to isolated programs which do not indulge 
in type punning (for example, by converting an int pointer to a char pointer and inspecting the 
pointed-to storage), but must be accounted for when conforming to externally-imposed storage layouts. 
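
The kind of type punning meant here can be sketched as a probe that reports where the low-order byte of an integer is stored. The result is machine dependent by design, and `low_order_byte_index` is a hypothetical name.

```c
/* Index, counting up from the lowest address, of the byte that holds
   the low-order bits of an unsigned int. */
int low_order_byte_index(void)
{
    unsigned int word = 1;
    unsigned char *p = (unsigned char *) &word;   /* inspect the pointed-to storage */
    unsigned int i;

    for (i = 0; i < sizeof word; i++)
        if (p[i] == 1)
            return (int) i;
    return -1;                  /* not expected on ordinary machines */
}
```

A result of 0 corresponds to the right-to-left assignment described for the PDP-11 and VAX-11; a machine that assigns left-to-right would report the last index instead.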

The language accepted by the various compilers differs in minor details. Most notably, the current 
PDP-11 compiler will not initialize structures containing bit-fields, and does not accept a few assignment 
operators in certain contexts where the value of the assignment is used. 


17. ANACHRONISMS 

Because C is an evolving language, certain obsolete constructions may be found in older programs. 
Although some versions of the compiler support such anachronisms, they have by and large disappeared, 
leaving only a portability problem behind. 

Earlier versions of C used the form =op instead of op= for assignment operators. This leads to 
ambiguities, typified by 


x=-1 


which assigns -1 to x, but previously decremented x.

The syntax of initializers has changed: previously, the equals sign that introduces an initializer was
not present, so instead of


int x = 1;


one used


int x 1;


The change was made because the initialization


int f (1+2)


resembles a function declaration closely enough to confuse the compilers.

A structure or union member reference is a chain of member references (qualifications) that are
prefixed by either a pointer to a structure or union or a structure or union proper. Because each
qualification implies the addition of an offset within an address computation, older compilers (which
failed to check for membership in the appropriate structure or union) allowed omission of those
qualifications with an offset of zero. Complete qualification is now required.

Previous versions of the compiler were lax in detecting mixed assignments involving pointers and 
arithmetic quantities. These are now remarked upon. 


18. SYNTAX SUMMARY
This summary of C syntax is intended more for aiding comprehension than as an exact statement of 
the language. 


18.1 Expressions 
The basic expressions are: 


expression:
primary
* expression
& lvalue
- expression
! expression
~ expression
++ lvalue
-- lvalue
lvalue ++
lvalue --
sizeof expression
( type-name ) expression
expression binop expression
expression ? expression : expression
lvalue asgnop expression
expression , expression


primary: 
identifier 
constant 
string 
( expression ) 
primary ( expression-list ) 
primary [ expression ] 
primary . identifier 
primary -> identifier


lvalue: 
identifier 
primary [ expression ] 
lvalue . identifier
primary -> identifier 
* expression 
( lvalue ) 


The primary-expression operators


() [] . ->


have highest priority and group left-to-right. The unary operators


* & - ! ~ ++ -- sizeof ( type-name )


have priority below the primary operators but higher than any binary operator, and group right-to-left. 
Binary operators group left-to-right; they have priority decreasing as indicated below. The conditional 
operator groups right to left. 


binop:
* / %
+ -
>> <<
< > <= >=
== !=
&
^
|
&&
||
?:


Assignment operators all have the same priority, and all group right-to-left. 


asgnop: 
= += -= *= /= %= >>= <<= &= ^= |=


The comma operator has the lowest priority, and groups left-to-right. 
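
A few of these grouping rules can be checked with small functions. A sketch in present-day C; all four function names are hypothetical.

```c
int left_to_right(void) { return 20 - 8 - 2; }     /* binary - groups as (20 - 8) - 2 */
int priority(void)      { return 1 + 2 * 3; }      /* * outranks +, giving 1 + (2 * 3) */

int right_to_left(void)
{
    int a, b;

    a = b = 4;                                      /* assignment groups as a = (b = 4) */
    return a + b;
}

int sign(int x)
{
    /* the conditional operator groups right to left:
       x < 0 ? -1 : (x > 0 ? 1 : 0) */
    return x < 0 ? -1 : x > 0 ? 1 : 0;
}
```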


18.2 Declarations 


declaration: 
decl-specifiers init-declarator-list_opt ;


decl-specifiers:
type-specifier decl-specifiers_opt
sc-specifier decl-specifiers_opt


sc-specifier:
auto 
static 
extern 
register 
typedef 


type-specifier: 
char 
short 
int 
long 
unsigned 
float 
double 
void 
struct-or-union-specifier 
typedef-name 
enum-specifier 


enum-specifier: 
enum { enum-list } 
enum identifier { enum-list } 
enum identifier 


enum-list: 
enumerator 
enum-list , enumerator 


enumerator:
identifier 
identifier = constant-expression 


init-declarator-list: 
init-declarator 
init-declarator , init-declarator-list 


init-declarator:
declarator initializer_opt


declarator:
identifier
( declarator )
* declarator
declarator ( )
declarator [ constant-expression_opt ]


struct-or-union-specifier:
struct { struct-decl-list } 
struct identifier { struct-decl-list } 
struct identifier 
union { struct-decl-list } 
union identifier { struct-decl-list } 
union identifier 


struct-decl-list: 
struct-declaration 
struct-declaration struct-decl-list 


struct-declaration: 
type-specifier struct-declarator-list ; 


struct-declarator-list: 
struct-declarator 
struct-declarator , struct-declarator-list 


struct-declarator: 
declarator 
declarator : constant-expression 
: constant-expression 


initializer:
= expression
= { initializer-list }
= { initializer-list , }


initializer-list:
expression
initializer-list , initializer-list
{ initializer-list }


type-name: 
type-specifier abstract-declarator 


abstract-declarator:
empty
( abstract-declarator )
* abstract-declarator
abstract-declarator ( )
abstract-declarator [ constant-expression_opt ]


typedef-name:
identifier


18.3 Statements 


compound-statement: 
{ declaration-list_opt statement-list_opt }


declaration-list: 
declaration 
declaration declaration-list 


statement-list: 
statement 
statement statement-list 


statement:
compound-statement
expression ;
if ( expression ) statement
if ( expression ) statement else statement
while ( expression ) statement
do statement while ( expression ) ;
for ( expression-1_opt ; expression-2_opt ; expression-3_opt ) statement
switch ( expression ) statement
case constant-expression : statement
default : statement
break ;
continue ;
return ;
return expression ;
goto identifier ;
identifier : statement
;


18.4 External definitions 


program: 
external-definition 
external-definition program 


external-definition: 
function-definition 
data-definition 


function-definition: 
type-specifier_opt function-declarator function-body


function-declarator:
declarator ( parameter-list_opt )




parameter-list: 
identifier 
identifier , parameter-list 


function-body:
declaration-list compound-statement


data-definition:
extern_opt type-specifier_opt init-declarator-list_opt ;
static_opt type-specifier_opt init-declarator-list_opt ;


18.5 Preprocessor 


#define identifier token-string
#define identifier( identifier , ... , identifier ) token-string
#undef identifier
#include "filename"
#include <filename>
#if constant-expression
#ifdef identifier
#ifndef identifier
#else
#endif
#line constant "filename"


January 1981